Observations from the basic profile report

These values are virtually unchanged from the earlier report, so we can proceed.

Value distribution of categorical columns.

Observation on Age

Observation on Experience

Observation on Income

Observation on log(Income)
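
A log transform is commonly applied to a right-skewed column such as Income so that a few very large values do not dominate the plots. The sketch below only illustrates the idea: the values are hypothetical, not taken from the dataset, and the natural log is an assumption about the transform used here.

```python
import math

# Hypothetical right-skewed income values (illustrative only, not from the dataset).
incomes = [8, 20, 45, 64, 98, 224]

def log_transform(values):
    """Return the natural log of each strictly positive value."""
    return [math.log(v) for v in values]

log_incomes = log_transform(incomes)

# Spread before and after the transform: the long right tail is compressed.
raw_range = max(incomes) - min(incomes)
log_range = max(log_incomes) - min(log_incomes)
```

For a column that can contain zeros, `math.log1p` (the log of 1 + x) is a common alternative, since the plain log is undefined at zero.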

Observation on ZIPCode

Observation on Family

Observation on CCAvg

Observation on log(CCAvg)

Observation on Education

Observation on Mortgage

Observation on Mortgage > 0

Observation on Personal_Loan (Target Variable)

Observation on Securities_Account

Observation on CD_Account

Observation on Online

Observation on CreditCard

Bivariate Analysis

Observations on correlations
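
As a reminder of what a correlation value measures, here is a minimal sketch of the Pearson coefficient in plain Python. The column values are hypothetical and chosen only for illustration; in particular, Experience is assumed to be an exact linear function of Age, which makes the coefficient essentially 1.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values, not taken from the dataset.
ages = [25, 35, 45, 55, 65]
experience = [a - 25 for a in ages]  # assumed exactly linear in age
family = [4, 1, 3, 1, 3]             # assumed unrelated to age

r_age_exp = pearson(ages, experience)   # essentially 1: perfectly linear
r_age_family = pearson(ages, family)    # weak: no clear linear relationship
```

Two columns that correlate this strongly carry nearly the same information, which is usually worth noting before model building.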

Count plots involving two attributes

log(Income) vs Age vs Personal_Loan

log(CCAvg) vs Age vs Personal_Loan

log(Mortgage) vs Age vs Personal_Loan

log(Income) vs ZIPCode vs Personal_Loan

log(CCAvg) vs ZIPCode vs Personal_Loan

log(Mortgage) vs ZIPCode vs Personal_Loan

Outlier treatment
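
One common way to treat outliers is to clip values to the interquartile-range (IQR) fences. The sketch below assumes that approach; it is not necessarily the method used in this analysis. The 1.5 multiplier and the sample values are also assumptions made for illustration.

```python
import statistics

def clip_to_iqr_fences(values, k=1.5):
    """Clip each value to [Q1 - k*IQR, Q3 + k*IQR]."""
    # The 'inclusive' method treats the data as the full population of interest.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    return [min(max(v, lower), upper) for v in values]

# Hypothetical values with one extreme point (not from the dataset).
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
clipped = clip_to_iqr_fences(sample)
```

Clipping keeps every row and only pulls extreme values back to the fence, whereas dropping rows would shrink an already imbalanced dataset.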

Model building starts here

Top 5 features that predict the target variable are:

These numbers are much better, and better balanced, than those from the first model before optimization.

The code below takes a while to run. This is expected.

We sacrificed a little Precision, but improved both Recall and F1.
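
The trade-off can be made concrete with the metric definitions. The confusion-matrix counts below are hypothetical, chosen only to show Precision dropping while Recall and F1 rise; they are not the results of the models in this analysis.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts for the positive class (loan accepted), before and after tuning.
before = precision_recall_f1(tp=60, fp=5, fn=40)
after = precision_recall_f1(tp=85, fp=15, fn=15)

# Catching more true positives (fewer false negatives) raises recall,
# at the cost of more false positives, which lowers precision.
```

For a loan campaign, missing a likely borrower (a false negative) is typically the costlier error, which is the usual argument for favoring Recall.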

Decision Tree

Top 5 important features based on Decision Tree model

Observations from the tree:

Using the decision rules extracted above, we can interpret the decision tree model as follows:

Conclusions and Summary

Recommendations